Internet Info 1997 December

Internet Message Format | 1997-02-19 | 15KB

Received: (from daemon@localhost) by services.bunyip.com (8.6.10/8.6.9) id TAA22791 for urn-ietf-out; Thu, 1 Aug 1996 19:23:01 -0400
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1]) by services.bunyip.com (8.6.10/8.6.9) with SMTP id TAA22782 for <urn-ietf@services.bunyip.com>; Thu, 1 Aug 1996 19:22:57 -0400
Received: from mintaka.lcs.mit.edu by mocha.bunyip.com with SMTP (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA23192 (mail destined for urn-ietf@services.bunyip.com); Thu, 1 Aug 96 19:22:54 -0400
Received: from skadhwe.lcs.mit.edu by MINTAKA.LCS.MIT.EDU id aa28913; 1 Aug 96 19:22 EDT
Received: by skadhwe.lcs.mit.edu; (5.65/1.1.8.2/15Aug95-0306PM) id AA09845; Thu, 1 Aug 1996 19:22:43 -0400
Date: Thu, 1 Aug 1996 19:22:43 -0400
Message-Id: <9608012322.AA09845@skadhwe.lcs.mit.edu>
From: Lewis Girod <girod@LCS.MIT.EDU>
To: terry@ora.com
Cc: urn-ietf@bunyip.com
In-Reply-To: <199607291642.MAA06586@services.bunyip.com> (message from Terry Allen on Sun, 28 Jul 1996 12:32:18 PDT)
Subject: Re: [URN] re NAPTR, URN-res-req
Sender: owner-urn-ietf@services.bunyip.com
Precedence: bulk
Reply-To: Lewis Girod <girod@LCS.MIT.EDU>
Errors-To: owner-urn-ietf@bunyip.com

At 12:32 PM 7/28/96 PDT, Terry Allen wrote:
>Much is made here of "hints," which I understand as heuristic instructions
>that the client or intermediary can use to determine how to start
>the URN resolution process. I think something like this is
>essential, but it needs fleshing out. Can a syntax or format
>of hints be generalized? (My guess is that it can't.)
>Can the semantics of hints be generalized: "for anything
>written by author={Jane Austen} see the Jane Austen Home Page"? (I think
>this might be possible for well defined bibliographic data.)

This was an area in which we were not very specific. In our private
discussions while writing this we developed a concept we mean by
``hints'' which we unfortunately forgot to explain clearly in the
document.
We used the term ``hint'' to mean precisely the data formats used by
the system to transmit to the client information that aids in locating
a document given a URN. Much of our document was intended to be a set
of general issues and requirements that we thought were critical to
producing a working URN system; for this reason we did not want to
specify formats there. I don't think coming up with formats is too
difficult; it is more a question of what clients actually understand.
At the time of writing we had envisioned such formats as the proposed
NAPTR and SRV records as possible initial ``hint'' formats, since if
the NAPTR system were in use this would be the type of data the clients
would be expecting. It should be stated, however, that we were not
intending hints to solve the sorts of semantics-oriented search
problems you describe; we see this as a higher level of the problem
that should be kept separate from the straight URN->document(s)
mapping problem.

>I may also wish to discover all resolvers for a given URN, because
>(especially after the publisher has gone away) some may return a
>better quality object than others--remember that "sameness" is defined
>by the publisher or namer, and an URN doesn't necessarily resolve
>to a unique object. Has this task been considered in either draft?

In the models we are considering, this problem would typically be
solved using unofficial hints. This is because there is an assumption
that the URN in question falls into the domain of some naming
authority, which provides (or does not provide) official hints as to
where the URN can be resolved. The naming authority may not provide or
even be aware of all relevant resolvers; for this reason collections of
unofficial hints may be provided by anyone who can direct people to
alternate resolvers.
In the long run unofficial hints are less efficient to maintain and are
in general less certain (the models we consider do not guarantee that
unofficial hints are provided with the accuracy accorded to official
hints), so they should be considered to be temporary fixes for
problematic situations. In answer to your question, in such a system
all advertised resolvers would be determinable.

>But in time, as NAs disappear, sometimes leaving large messes behind
>them, there will be URNs for which no single authoritative resolver
>exists; this problem is only compounded for hints, which offer fertile
>...
>Thus much of section 4.1 is moot. One cannot require that there be
>authoritative URN resolution services; there will be simply services.
>Whether they work for you or not is something you'll have to do your
>homework on. The hints you'll find most valuable may be those
>supplied by third parties, or discovered by your own desktop client
>based on experience; and there may be no correlation with their
>supposed authoritativeness.

These concerns are definitely something to think about. In terms of
naming authorities disappearing, I had envisioned the solution being
for the naming authority to transfer or sell off its namespace to other
entities before going away. For example, if you have some names issued
by shady.com, they might sell them to you before they go out of
business, at which point you would be responsible for telling the
resolution system where your authoritative server is. In our model
``authoritative server'' does not imply much of anything in terms of
authenticity -- it merely is the server that the naming authority
owning the names says is authentic. This is another place where
unofficial hints come into play; an unofficial hint may give
information that contradicts the ``official'' information from the
``official'' naming authority.
In the end it is up to the user to decide who is telling the truth; it
is quite possible that the unofficial hints are written by the author
of the eventual resource (and authenticatable) but to know that you
need to know who the author is, etc.

Note that our ideas of having unofficial hints integrated into the
system do not preclude third party hint systems! In fact third party
hint systems are a really good idea, but from my perspective it seems
best to also try to integrate them into the official infrastructure in
the interests of fairness (it may not succeed, of course... :-) ).

Another issue that is parallel to this is that the resolution models
being discussed here all try to deal with URNs in groups in order to
cut down on the size of higher levels of the database. NAPTR does this
in a similar way to the DNS. The model I have in mind assumes that the
top level of the directory is pretty big, but it is still based on the
idea of sending all URNs that start with a given prefix to a particular
resolver. In the end this is a good idea because it allows changes to
be made at a local level rather than having to bother the top level
every time. This means that the system we are talking about does not
deal with individual URNs under normal circumstances; they will always
be generalized somehow, in my mind usually through the concept of a
naming authority. Note also that systems set up this way cannot deal
with the semantic content of URNs because that radically changes the
way they are hierarchicalized (for example, sorting on author and
sorting on title produce two very different search trees). Since the
data they are storing is really just a description of generalization,
these systems always deal with just one hierarchy and stick to it.

>4.1.2 describes as necessary a Hint Passing Interface, and says
>that hints lie within given naming schemes. Gosh, this seems
>simplistic. Why is this a requirement for URN resolution? Has
>anything of the sort been attempted?
>Why must hints be restricted
>to the NA vector ("anything written by author={Jane Austen}")?
>This is not the way we locate most bibliographic information today.

It should be noted that section 4 describes some ideas we had for a
model or framework for URNs, more than it states requirements. The
requirements we wanted to set are in section 2.

>BTW, who owns the URN? the top-level NA, or the NA responsible for
>actually associating the URN with something? If I offer resolution
>for your URNs, can you object? especially if the URN denotes an
>object neither of us owns rights in or that cannot be retrieved?
>Pseudoexamples:
>FPI:ISO:+ISO/IEC 8070/RA::A00002/Athens/Acropolis/Erectheum
>PP:944940/Domesday Book/page 18
>USArchives:Nixon/tape/18.5-min-gap

Interesting question. In the end I suppose this is a legal issue, but
in my opinion the answer is no. Just because a URN is resolved to a
location does not imply that (a) the location serves it to you, or (b)
the location admits that it exists. I suppose that certain types of
resolution process might be proprietary, and in that case executing the
algorithm might be illegal even if it was independently derived. I have
never agreed with this, but that probably has little bearing on
reality.

>4.2.1 "There is a need for an effective model of name delegation."
>This form of words, "there is a need", is becoming all too common.
>What is the need? if the need is not satisfied (and it never
>will be fully satisfied in practice), does URN resolution become
>impossible or merely messy?

Well, to the degree that the need is not satisfied things may get messy
and/or people may get annoyed. I don't think it is the end of the
world, nor is it entirely avoidable, but I do believe that certain
design choices can make a difference. In this section I was trying to
analyse exactly what people want out of a name and how that interferes
with the desired property of longevity.
The essential problem is that in the short term people don't generally
care about longevity so they do things that may screw them later. The
fact is that if what gets built is pretty much good enough, people will
use it even if it doesn't really do quite what they want, especially if
it is the first thing that came along and everybody's browser
understands it. For that reason I think that setting a few design
limitations that happen to prevent certain types of predictable
disaster might be a good thing overall.

>4.2.2 states some obvious things that will be under the control of
>NAs, which will do as they please. The behavior of NAs is not
>within scope for the IETF, so long as they don't overlap each
>other's namespaces.

True, but these constraints only affect them if they choose to utilize
the system that is being described in section 4. If they want to do
something else, they are free to escape to their own entirely separate
and unregulated resolution system. (Note that this is a problem for
NAPTR as well, although it is not explicitly stated. There are
namespaces which do not lend themselves to NAPTR-style decomposition
and thus would need to be escaped.) The question is, how restrictive is
this really? I have been thinking a lot about this and have found
little in the way of problematic counterexamples.

>But this admittedly "vague sketch" of strategies is simply too vague,
>as it relies on the concept of hints but doesn't describe them, and
>ignores the power of supplying as input into the URN resolution system
>ancillary information about the URN (hints in reverse, so to speak).

True. Our omission of an explanation of hints has caused a great deal
of confusion, which this reply has hopefully addressed to a degree.
Since we did not have a clear technical plan at the time of writing we
did not want to go into a lot of detail; we hoped instead to get out
some ideas we were playing with and see what sort of reactions were
generated.
As for the idea of ``hints in reverse'', the URN community should
decide if this is within the scope of our problem or whether we want to
declare it outside. My opinion is that it is outside; it widens the
problem and makes the primary technique for resolution (i.e.
precomputed lexical generalization of URNs) impossible.

In any case, thanks a lot for the comments. Yours are the first to
address any of the ideas in section 4. I have been continuing to work
on those ideas, in particular the stuff relating to canonical forms.
Following this message I am forwarding a pointer to a small explanation
of how various namespaces might be canonicalized on the client side
enough to be taken apart by very simple rules.

The canonicalization is done by a simple translation program that takes
a little ``program'' as input. The idea is to rearrange the URN at the
beginning so that the hierarchy goes in one consistent direction; the
theory is that after the name is canonicalized the rest of the
resolution can be done with very simple ``rules''. Essentially, where
NAPTR allows arbitrary rewrite patterns, the simplified rules allow
only a few very simple moves: removing characters from the front of the
canonicalized string and either copying them or replacing them with
specified strings. This way most of the data driving the resolution are
in this simplified format, which makes it easier to handle and easier
to specify from above.

Under such a system, a name scheme designer can enforce certain
policies about the format and semantics of the names under that scheme,
whereas according to the original NAPTR model the top-level name scheme
design had no impact on the structure underneath. Such a modified NAPTR
resolution mechanism will be able to resolve a very specific subset of
all possible namespaces, and the rest will be escaped to other
resolution mechanisms.
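
To make the two steps above concrete, here is a rough sketch in Python.
It is only an illustration under assumptions of my own, not the
translation program or rule format referred to in this message: the
``URN:dns:host/path'' name syntax, the canonical form with ``/''
separators, the Rule tuple, and the resolver name resolver.mit.example
are all hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


def canonicalize(urn: str) -> str:
    """Rearrange a hypothetical 'URN:dns:<host>/<path>' name so that the
    hierarchy runs in one direction, most general component first."""
    _scheme, kind, rest = urn.split(":", 2)
    host, _, path = rest.partition("/")
    # DNS-style hosts are written most-specific-first, so reverse them.
    labels = list(reversed(host.lower().split(".")))
    parts = labels + [p for p in path.split("/") if p]
    return "/".join([kind.lower()] + parts)


class Rule(NamedTuple):
    prefix: str          # characters to remove from the front of the name
    emit: Optional[str]  # None: copy the prefix to the output; else emit this


def apply_rules(name: str, rules: List[Rule]) -> Tuple[str, str]:
    """Repeatedly strip a matching prefix from the front of `name`.

    Returns (output built so far, unconsumed remainder). Only the two
    moves described in the text are allowed: copy or replace."""
    out: List[str] = []
    remaining = name
    progress = True
    while progress:
        progress = False
        for rule in rules:
            if rule.prefix and remaining.startswith(rule.prefix):
                out.append(rule.prefix if rule.emit is None else rule.emit)
                remaining = remaining[len(rule.prefix):]
                progress = True
                break
    return "".join(out), remaining


if __name__ == "__main__":
    rules = [
        Rule("dns/edu/mit/", "resolver.mit.example/"),  # replace move
        Rule("lcs/", None),                             # copy move
    ]
    canonical = canonicalize("URN:dns:lcs.mit.edu/girod/paper.ps")
    print(canonical)
    print(apply_rules(canonical, rules))
```

Because every rule only looks at the front of the canonical string, it
is easy to see from the rule table alone which names a given resolver
will be handed; arbitrary rewrite patterns do not give that guarantee.
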
The main difference here (as far as I can tell) is that using the
simplified rules it becomes more obvious which naming schemes are
possible and which are not. It is not clear that this modification
results in any significant reduction in the universe of resolvable name
schemes. In fact, NAPTR is not well suited to many types of namespaces.
For example, a namespace of MD5 digests of documents could not
efficiently be resolved by NAPTR, nor can it be efficiently resolved by
this modified system.

One name space that the original NAPTR plan can resolve that the
modified one can't is a namespace in which there is no pre-defined
hierarchy. For example, generalization can be based on flags:

URN:flags:girod#paper.ps#about-translation#about-urns

contains several #-delimited flags that could be matched (for example,
all URNs of that type containing my name go to my resolver, etc.)
However, for a number of reasons, this is not currently a useful way of
structuring URNs, and if it becomes useful it seems reasonable to build
a separate resolution mechanism.
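
The difference between flag matching and front-of-string rules can be
sketched as follows. This is again only an illustration in Python; the
helper names are made up, and the second URN (with the flags in a
different order) is a hypothetical variant of the example above.

```python
def flags_of(urn: str) -> set:
    """Split a hypothetical 'URN:flags:a#b#c' name into its set of flags."""
    _scheme, _kind, rest = urn.split(":", 2)
    return set(rest.split("#"))


def flag_rule_matches(urn: str, required_flag: str) -> bool:
    """Pattern-style rule: the flag may appear anywhere in the name."""
    return required_flag in flags_of(urn)


def prefix_rule_matches(urn: str, prefix: str) -> bool:
    """Simplified rule: only the front of the name can be examined."""
    return urn.startswith(prefix)


if __name__ == "__main__":
    a = "URN:flags:girod#paper.ps#about-translation#about-urns"
    b = "URN:flags:paper.ps#about-urns#girod"  # same flag, different position
    for urn in (a, b):
        print(urn,
              flag_rule_matches(urn, "girod"),
              prefix_rule_matches(urn, "URN:flags:girod#"))
```

A flag rule catches both names, while a front-of-string rule catches
only the one that happens to list the flag first, which is why a
namespace without a pre-defined hierarchy falls outside the simplified
scheme.
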